Results 1 - 20 of 77
1.
EMBO J ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565951

ABSTRACT

A great deal of work has revealed, in structural detail, the components of the preinitiation complex (PIC) machinery required for initiation of mRNA gene transcription by RNA polymerase II (Pol II). However, less well understood are the in vivo PIC assembly pathways and their kinetics, an understanding of which is vital for determining how rates of in vivo RNA synthesis are established. We used competition ChIP in budding yeast to obtain genome-scale estimates of the residence times for five general transcription factors (GTFs): TBP, TFIIA, TFIIB, TFIIE and TFIIF. While many GTF-chromatin interactions were short-lived (< 1 min), there were numerous interactions with residence times in the range of several minutes. Sets of genes with a shared function also shared similar patterns of GTF kinetic behavior. TFIIE, a GTF that enters the PIC late in the assembly process, had residence times correlated with RNA synthesis rates. The datasets and results reported here provide kinetic information for most of the Pol II-driven genes in this organism, offering a rich resource for exploring the mechanistic relationships between PIC assembly, gene regulation, and transcription.

2.
Front Immunol ; 15: 1380641, 2024.
Article in English | MEDLINE | ID: mdl-38601144

ABSTRACT

Recent studies have demonstrated a role for Ten-Eleven Translocation-2 (TET2), an epigenetic modulator, in regulating germinal center formation and plasma cell differentiation in B-2 cells, yet the role of TET2 in regulating B-1 cells is largely unknown. Here, B-1 cell subset numbers, IgM production, and gene expression were analyzed in mice with global knockout of TET2 compared to wildtype (WT) controls. Results revealed that TET2-KO mice had elevated numbers of B-1a and B-1b cells in their primary niche, the peritoneal cavity, as well as in the bone marrow (B-1a) and spleen (B-1b). Consistent with this finding, circulating IgM, but not IgG, was elevated in TET2-KO mice compared to WT. Analysis of bulk RNASeq of sort-purified peritoneal B-1a and B-1b cells revealed reduced expression of heavy and light chain immunoglobulin genes, predominantly in B-1a cells from TET2-KO mice compared to WT controls. As expected, IgM was the most abundantly expressed isotype in B-1 cells. Yet only in B-1a cells was there a significant increase in the proportion of IgM transcripts in TET2-KO mice compared to WT. Analysis of the CDR3 of the BCR revealed an increased abundance of replicated CDR3 sequences in B-1 cells from TET2-KO mice, which was more pronounced in B-1a than in B-1b cells. V-D-J usage and circos plot analysis of V-J combinations showed enhanced usage of VH11 and VH12 pairings. Taken together, our study is the first to demonstrate that global loss of TET2 increases B-1 cell number and IgM production and reduces CDR3 diversity, which could impact many biological processes and disease states that are regulated by IgM.


Subjects
B-Lymphocyte Subsets , Mice , Animals , B-Lymphocyte Subsets/metabolism , B-Lymphocytes , Immunoglobulin Light Chains/genetics , Translocation, Genetic , Immunoglobulin M , Cell Count
3.
Genes (Basel) ; 15(2)2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38397134

ABSTRACT

Characterization of gene regulatory mechanisms in cancer is a key task in cancer genomics. CCCTC-binding factor (CTCF), a DNA binding protein, exhibits specific binding patterns in the genome of cancer cells and has a non-canonical function to facilitate oncogenic transcription programs by cooperating with transcription factors bound at flanking distal regions. Identification of DNA sequence features from a broad genomic region that distinguish cancer-specific CTCF binding sites from regular CTCF binding sites can help find oncogenic transcription factors in a cancer type. However, the presence of long DNA sequences without localization information makes it difficult to perform conventional motif analysis. Here, we present DNAResDualNet (DARDN), a computational method that utilizes convolutional neural networks (CNNs) for predicting cancer-specific CTCF binding sites from long DNA sequences and employs DeepLIFT, a method for interpretability of deep learning models that explains the model's output in terms of the contributions of its input features. The method is used for identifying DNA sequence features associated with cancer-specific CTCF binding. Evaluation on DNA sequences associated with CTCF binding sites in T-cell acute lymphoblastic leukemia (T-ALL) and other cancer types demonstrates DARDN's ability to distinguish DNA sequences surrounding cancer-specific CTCF binding from control constitutive CTCF binding and to identify sequence motifs for transcription factors potentially active in each specific cancer type. We identify potential oncogenic transcription factors in T-ALL, acute myeloid leukemia (AML), breast cancer (BRCA), colorectal cancer (CRC), lung adenocarcinoma (LUAD), and prostate cancer (PRAD). Our work demonstrates the power of advanced machine learning and feature discovery approaches in finding biologically meaningful information from complex high-throughput sequencing data.


Subjects
Deep Learning , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Humans , CCCTC-Binding Factor/genetics , CCCTC-Binding Factor/metabolism , DNA/genetics , Transcription Factors/metabolism
4.
bioRxiv ; 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37546819

ABSTRACT

Background: A great deal of work has revealed in structural detail the components of the machinery responsible for mRNA gene transcription initiation. These include the general transcription factors (GTFs), which assemble at promoters along with RNA Polymerase II (Pol II) to form a preinitiation complex (PIC) aided by the activities of cofactors and site-specific transcription factors (TFs). However, less well understood are the in vivo PIC assembly pathways and their kinetics, an understanding of which is vital for determining on a mechanistic level how rates of in vivo RNA synthesis are established and how cofactors and TFs impact them. Results: We used competition ChIP to obtain genome-scale estimates of the residence times for five GTFs: TBP, TFIIA, TFIIB, TFIIE and TFIIF in budding yeast. While many GTF-chromatin interactions were short-lived (< 1 min), there were numerous interactions with residence times in the several minutes range. Sets of genes with a shared function also shared similar patterns of GTF kinetic behavior. TFIIE, a GTF that enters the PIC late in the assembly process, had residence times correlated with RNA synthesis rates. Conclusions: The datasets and results reported here provide kinetic information for most of the Pol II-driven genes in this organism and therefore offer a rich resource for exploring the mechanistic relationships between PIC assembly, gene regulation, and transcription. The relationships between gene function and GTF dynamics suggest that shared sets of TFs tune PIC assembly kinetics to ensure appropriate levels of expression.

5.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37540223

ABSTRACT

MOTIVATION: The rapid advance in single-cell RNA sequencing (scRNA-seq) technology over the past decade has provided a rich resource of gene expression profiles of single cells measured on patients, facilitating the study of many biological questions at the single-cell level. One intriguing research direction is to study the single cells that play critical roles in patients' phenotypes, which has the potential to identify the cells and genes driving disease phenotypes. To this end, deep learning models are expected to encode the single-cell information well and achieve precise prediction of patients' phenotypes using scRNA-seq data. However, we face critical challenges in designing deep learning models for classifying patient samples because (i) the samples collected in the same dataset contain a variable number of cells (some samples might have only hundreds of cells sequenced while others could have thousands), and (ii) the number of samples available is typically small and the expression profile of each cell is noisy and extremely high-dimensional. Moreover, the black-box nature of existing deep learning models makes it difficult for researchers to interpret the models and extract useful knowledge from them. RESULTS: We propose a prototype-based and cell-informed model for patient phenotype classification, termed ProtoCell4P, that can alleviate the problems of sample scarcity and variable cell number by leveraging the cell knowledge with representatives of cells (called prototypes), and precisely classify the patients by adaptively incorporating information from different cells. Moreover, this classification process can be explicitly interpreted by identifying the key cells for decision making and by further summarizing the knowledge of cell types to unravel the biological nature of the classification. Our approach is explainable at single-cell resolution and can identify the key cells in each patient's classification. 
The experimental results demonstrate that our proposed method can effectively deal with patient classifications using single-cell data and outperforms the existing approaches. Furthermore, our approach is able to uncover the association between cell types and biological classes of interest from a data-driven perspective. AVAILABILITY AND IMPLEMENTATION: https://github.com/Teddy-XiongGZ/ProtoCell4P.


Subjects
Single-Cell Analysis , Single-Cell Gene Expression Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Neural Networks, Computer , Transcriptome , Gene Expression Profiling/methods , Cluster Analysis
6.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-36864611

ABSTRACT

MOTIVATION: Despite the success of recent machine learning algorithms' applications to survival analysis, their black-box nature hinders interpretability, which is arguably the most important aspect. Similarly, multi-omics data integration for survival analysis is often constrained by the underlying relationships and correlations that are rarely well understood. The goal of this work is to alleviate the interpretability problem in machine learning approaches for survival analysis and also demonstrate how multi-omics data integration improves survival analysis and pathway enrichment. We use meta-learning, a machine-learning algorithm that is trained on a variety of related datasets and allows quick adaptations to new tasks, to perform survival analysis and pathway enrichment on pan-cancer datasets. In recent machine learning research, meta-learning has been effectively used for knowledge transfer among multiple related datasets. RESULTS: We use meta-learning with Cox hazard loss to show that the integration of TCGA pan-cancer data increases the performance of survival analysis. We also apply an advanced model interpretability method called DeepLIFT (Deep Learning Important FeaTures) to show different sets of enriched pathways for multi-omics and transcriptomics data. Our results show that multi-omics cancer survival analysis enhances performance compared with using transcriptomics or clinical data alone. Additionally, we show a correlation between variable importance assignment from DeepLIFT and gene coenrichment, suggesting that genes with higher and similar contribution scores are more likely to be enriched together in the same enrichment sets. AVAILABILITY AND IMPLEMENTATION: https://github.com/berkuva/TCGA-omics-integration.


Subjects
Multiomics , Neoplasms , Humans , Algorithms , Neoplasms/genetics , Gene Expression Profiling , Machine Learning
7.
Ann Surg ; 278(3): e589-e597, 2023 09 01.
Article in English | MEDLINE | ID: mdl-36538614

ABSTRACT

OBJECTIVE: Develop a predictive model to identify patients with 1 pathologic lymph node (pLN) versus >1 pLN using machine learning applied to gene expression profiles and clinical data as input variables. BACKGROUND: Standard management for clinically detected melanoma lymph node metastases is complete therapeutic LN dissection (TLND). However, >40% of patients with a clinically detected melanoma lymph node will only have 1 pLN on final review. Recent data suggest that targeted excision of just the single enlarged LN may provide excellent regional control, with less morbidity than TLND. The selection of patients for less morbid surgery requires accurate identification of those with only 1 pLN. METHODS: The Cancer Genome Atlas database was used to identify patients who underwent TLND for melanoma. Pathology reports in The Cancer Genome Atlas were reviewed to identify the number of pLNs. Patients were included for machine learning analyses if RNA sequencing data were available from a pLN. After feature selection, the top 20 gene expression and clinical input features were used to train a ridge logistic regression model to predict patients with 1 pLN versus >1 pLN using 10-fold cross-validation on 80% of samples. The model was then tested on the remaining holdout samples. RESULTS: A total of 153 patients met inclusion criteria: 64 with one pLN (42%) and 89 with >1 pLNs (58%). Feature selection identified 1 clinical (extranodal extension) and 19 gene expression variables used to predict patients with 1 pLN versus >1 pLN. The ridge logistic regression model identified patient groups with an accuracy of 90% and an area under the receiver operating characteristic curve of 0.97. CONCLUSIONS: Gene expression profiles together with clinical variables can distinguish melanoma metastasis patients with 1 pLN versus >1 pLN. 
Future models trained using positron emission tomography/computed tomography imaging, gene expression, and relevant clinical variables may further improve accuracy and may predict patients who can be managed with a targeted LN excision rather than a complete TLND.
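The evaluation design described in this abstract (10-fold cross-validation on 80% of samples, then a test on the remaining holdout) can be sketched as an index split. This is a minimal illustration only: the function name `holdout_and_folds`, the random shuffle, the seed, and the round-robin fold assignment are assumptions, and the abstract does not state whether the authors stratified the split.

```python
import random


def holdout_and_folds(n_samples, train_frac=0.8, n_folds=10, seed=0):
    """Split sample indices into a training pool and a holdout set,
    then partition the training pool into cross-validation folds.

    Hypothetical helper: the shuffle and the unstratified split are
    assumptions, not details taken from the abstract.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_frac)
    train, holdout = indices[:n_train], indices[n_train:]
    # Deal training indices round-robin so fold sizes differ by at most one.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return train, holdout, folds


# 153 is the cohort size reported in the abstract.
train, holdout, folds = holdout_and_folds(153)
fold_sizes = [len(fold) for fold in folds]
```

Each fold would serve once as the validation set while the other nine train the model; the holdout indices are touched only for the final test.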


Subjects
Melanoma , Skin Neoplasms , Humans , Lymphatic Metastasis/pathology , Skin Neoplasms/genetics , Skin Neoplasms/surgery , Skin Neoplasms/pathology , Melanoma/genetics , Melanoma/surgery , Melanoma/pathology , Lymph Nodes/pathology , Decision Making , Lymph Node Excision , Retrospective Studies
8.
Biomolecules ; 12(11)2022 10 27.
Article in English | MEDLINE | ID: mdl-36358927

ABSTRACT

Quantum computing holds great promise for a number of fields including biology and medicine. A major application in which quantum computers could yield advantage is machine learning, especially kernel-based approaches. A recent method termed quantum metric learning, in which a quantum embedding which maximally separates data into classes is learned, was able to perfectly separate ant and bee image training data. The separation is achieved with an intrinsically quantum objective function and the overall approach was shown to work naturally as a hybrid classical-quantum computation enabling embedding of high dimensional feature data into a small number of qubits. However, the ability of the trained classifier to predict test sample data was never assessed. We assessed the performance of quantum metric learning on test ants and bees image data as well as breast cancer clinical data. We applied the original approach as well as variants in which we performed principal component analysis (PCA) on the feature data to reduce its dimensionality for quantum embedding, thereby limiting the number of model parameters. If the degree of dimensionality reduction was limited and the number of model parameters was constrained to be far less than the number of training samples, we found that quantum metric learning was able to accurately classify test data.


Subjects
Algorithms , Computing Methodologies , Animals , Bees , Quantum Theory , Artificial Intelligence , Machine Learning
9.
Circ Res ; 130(9): 1345-1361, 2022 04 29.
Article in English | MEDLINE | ID: mdl-35369706

ABSTRACT

BACKGROUND: DYRK1a (dual-specificity tyrosine phosphorylation-regulated kinase 1a) contributes to the control of cycling cells, including cardiomyocytes. However, the effects of inhibition of DYRK1a on cardiac function and cycling cardiomyocytes after myocardial infarction (MI) remain unknown. METHODS: We investigated the impacts of pharmacological inhibition and conditional genetic ablation of DYRK1a on endogenous cardiomyocyte cycling and left ventricular systolic function in ischemia-reperfusion (I/R) MI using αMHC-MerDreMer-Ki67p-RoxedCre::Rox-Lox-tdTomato-eGFP (RLTG) (denoted αDKRC::RLTG) and αMHC-Cre::Fucci2aR::DYRK1a^flox/flox mice. RESULTS: We observed that harmine, an inhibitor of DYRK1a, improved left ventricular ejection fraction (39.5±1.6% and 29.1±1.6%, harmine versus placebo, respectively), 2 weeks after I/R MI. Harmine also increased cardiomyocyte cycling after I/R MI in αDKRC::RLTG mice, 10.8±1.5 versus 24.3±2.6 enhanced Green Fluorescent Protein (eGFP)+ cardiomyocytes, placebo versus harmine, respectively, P=1.0×10^-3. The effects of harmine on left ventricular ejection fraction were attenuated in αDKRC::DTA mice that expressed an inducible diphtheria toxin in adult cycling cardiomyocytes. The conditional cardiomyocyte-specific genetic ablation of DYRK1a in αMHC-Cre::Fucci2aR::DYRK1a^flox/flox (denoted DYRK1a k/o) mice caused cardiomyocyte hyperplasia at baseline (210±28 versus 126±5 cardiomyocytes per 40× field, DYRK1a k/o versus controls, respectively, P=1.7×10^-2) without changes in cardiac function compared with controls, or compensatory changes in the expression of other DYRK isoforms. After I/R MI, DYRK1a k/o mice had improved left ventricular function (left ventricular ejection fraction 41.8±2.2% and 26.4±0.8%, DYRK1a k/o versus control, respectively, P=3.7×10^-2). 
RNAseq of cardiomyocytes isolated from αMHC-Cre::Fucci2aR::DYRK1a^flox/flox and αMHC-Cre::Fucci2aR mice after I/R MI or Sham surgeries identified enrichment in mitotic cell cycle genes in αMHC-Cre::Fucci2aR::DYRK1a^flox/flox compared with αMHC-Cre::Fucci2aR. CONCLUSIONS: The pharmacological inhibition or cardiomyocyte-specific ablation of DYRK1a caused baseline hyperplasia and improved cardiac function after I/R MI, with an increase in cell cycle gene expression, suggesting the inhibition of DYRK1a may serve as a therapeutic target to treat MI.


Subjects
Myocardial Infarction , Myocytes, Cardiac , Animals , Disease Models, Animal , Harmine/metabolism , Harmine/pharmacology , Hyperplasia/metabolism , Mice , Myocardial Infarction/metabolism , Myocytes, Cardiac/metabolism , Stroke Volume , Ventricular Function, Left
10.
J Transl Med ; 19(1): 371, 2021 08 28.
Article in English | MEDLINE | ID: mdl-34454518

ABSTRACT

BACKGROUND: Immune cells in the tumor microenvironment have prognostic value. In preclinical models, recruitment and infiltration of these cells depend on immune cell homing (ICH) genes such as chemokines, cell adhesion molecules, and integrins. We hypothesized ICH ligands CXCL9-11 and CCL2-5 would be associated with intratumoral T-cells, while CXCL13 would be more associated with B-cell infiltrates. METHODS: Samples of human melanoma were submitted for gene expression analysis and immune cells identified by immunohistochemistry. Associations between the two were evaluated with unsupervised hierarchical clustering using correlation matrices from Spearman rank tests. Univariate analysis was performed with Mann-Whitney tests. RESULTS: For 119 melanoma specimens, analysis of 78 ICH genes revealed association among genes with nonspecific increase of multiple immune cell subsets: CD45+, CD8+ and CD4+ T-cells, CD20+ B-cells, CD138+ plasma cells, and CD56+ NK-cells. ICH genes most associated with these infiltrates included ITGB2, ITGAL, CCL19, CXCL13, plus receptor/ligand pairs CXCL9 and CXCL10 with CXCR3; CCL4 and CCL5 with CCR5. This top ICH gene expression signature was also associated with genes representing immune-activation and effector function. In contrast, CD163+ M2-macrophages were weakly associated with a different ICH gene signature. CONCLUSION: These data do not support our hypothesis that each immune cell subset is uniquely associated with specific ICH genes. Instead, a larger set of ICH genes identifies melanomas with concordant infiltration of B-cell and T-cell lineages, while CD163+ M2-macrophage infiltration suggests alternate mechanisms for their recruitment. Future studies should explore the extent to which the ICH gene signature contributes to tertiary lymphoid structures or cross-talk between homing pathways.
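The Spearman rank correlation named in the methods is the Pearson correlation computed on ranks. The sketch below shows that calculation in plain Python, with tied values given their average rank. The input numbers are invented for illustration and are not data from the study; the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: expression of one gene and one immune cell count
# across five specimens.
gene_expression = [2.1, 5.3, 3.3, 8.0, 1.2]
cell_counts = [10, 40, 25, 90, 5]
rho = spearman(gene_expression, cell_counts)
```

Computing `spearman` for every gene against every immune cell subset would give a correlation matrix of the kind that hierarchical clustering can then take as input.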


Subjects
Antigens, CD , Melanoma , Antigens, CD/genetics , Antigens, Differentiation, Myelomonocytic , Humans , Lymphocyte Subsets , Macrophages , Melanoma/genetics , Receptors, Cell Surface , Tumor Microenvironment
11.
Sci Rep ; 11(1): 10826, 2021 05 24.
Article in English | MEDLINE | ID: mdl-34031486

ABSTRACT

Head and neck cancer is the sixth most common cancer worldwide with a 5-year survival of only 65%. Targeting compensatory signaling pathways may improve therapeutic responses and combat resistance. Utilizing reverse phase protein arrays (RPPA) to assess the proteome and explore mechanisms of synergistic growth inhibition in HNSCC cell lines treated with IGF1R and Src inhibitors, BMS754807 and dasatinib, respectively, we identified focal adhesion signaling as a critical node. Focal Adhesion Kinase (FAK) and Paxillin phosphorylation were decreased as early as 15 min after treatment, and treatment with a FAK inhibitor, PF-562,271, was sufficient to decrease viability in vitro. Treatment of 3D spheroids demonstrated robust cytotoxicity suggesting that the combination of BMS754807 and dasatinib is effective in multiple experimental models. Furthermore, treatment with BMS754807 and dasatinib significantly decreased cell motility, migration, and invasion in multiple HNSCC cell lines. Most strikingly, treatment with BMS754807 and dasatinib, or a FAK inhibitor alone, significantly increased cleaved-PARP in human ex-vivo HNSCC patient tissues demonstrating a potential clinical utility for targeting FAK or the combined targeting of the IGF1R with Src. This ex-vivo result further confirms FAK as a vital signaling node of this combinatorial treatment and demonstrates therapeutic potential for targeting FAK in HNSCC patients.


Subjects
Dasatinib/pharmacology , Focal Adhesion Kinase 1/metabolism , Head and Neck Neoplasms/metabolism , Indoles/pharmacology , Pyrazoles/pharmacology , Squamous Cell Carcinoma of Head and Neck/metabolism , Sulfonamides/pharmacology , Triazines/pharmacology , Cell Line, Tumor , Cell Movement/drug effects , Cell Proliferation/drug effects , Cell Survival/drug effects , Drug Synergism , Gene Expression Regulation, Neoplastic/drug effects , Head and Neck Neoplasms/drug therapy , Humans , Paxillin/metabolism , Phosphorylation/drug effects , Protein Array Analysis , Signal Transduction/drug effects , Squamous Cell Carcinoma of Head and Neck/drug therapy
14.
J Pediatr Surg ; 56(2): 286-292, 2021 Feb.
Article in English | MEDLINE | ID: mdl-32682541

ABSTRACT

PURPOSE: Hepatoblastoma is the most common liver malignancy in children. In order to advance therapy against hepatoblastoma, novel immunologic targets and biomarkers are needed. Our purpose in this investigation is to examine hepatoblastoma transcriptomes for the expression of a class of genomic elements known as Human Endogenous Retroviruses (HERVs). HERVs are abundant in the human genome and are biologically active elements that have been associated with multiple malignancies and proposed as immunologic targets in a subset of tumors. A sub-family of HERVs, HERV-K(HML-2) (HERV-K), has been shown to be tightly regulated in fetal development, making investigation of these elements in pediatric tumors paramount. METHODS: We first created a HERVK-FASTA file utilizing 91 previously described HML-2 proviruses. We then concatenated the file onto the GRCh38.95 cDNA library from Ensembl. We used this reference database to evaluate existing RNA-seq data from 10 hepatoblastoma tumors and 3 normal liver controls (GEO accession ID: GSE89775). Quantification and differential proviral expression analysis between hepatoblastoma and normal liver controls were performed using the pseudo-alignment program Salmon and DESeq2, respectively. RESULTS: HERV-K mRNA was expressed in hepatoblastoma from multiple proviral loci. All expressed HERV-K proviral loci were upregulated in hepatoblastoma compared to normal liver controls. Five HERV-K proviruses (1q21.3, 3q27.2, 7q22.2, 12q24.33 and 17p13.1) were significantly differentially expressed (p-adjusted value <0.05, |log2 fold change| > 1.5) across conditions. The provirus at 17p13.1 had an approximately 300-fold increased expression in hepatoblastoma as compared to normal liver. This was in part due to the near absence of HERV-K mRNA at the 17p13.1 locus in fully differentiated liver samples. 
CONCLUSIONS: Our investigation demonstrates that HERV-K is expressed from multiple loci in hepatoblastoma and that the expression is increased for several proviruses compared to normal liver controls. Our results suggest that HERV-K mRNA expression may be useful as a biomarker in hepatoblastoma, given the large differential expression profiles in hepatoblastoma, with very low mRNA levels in liver control samples.
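The significance criteria quoted in the results (adjusted p-value < 0.05 and |log2 fold change| > 1.5) amount to a simple filter over a differential expression table. The sketch below applies that filter to invented values; the locus names, the numbers, and the helper `significant_proviruses` are hypothetical and do not reproduce the authors' DESeq2 output. It also shows that a 300-fold change corresponds to a log2 fold change of about 8.23.

```python
import math


def significant_proviruses(results, padj_cutoff=0.05, lfc_cutoff=1.5):
    """Return names passing both cutoffs.

    `results` maps a name to (log2_fold_change, adjusted_p_value).
    A missing adjusted p-value (None) is treated as not significant.
    """
    return [
        name
        for name, (log2_fc, padj) in results.items()
        if padj is not None and padj < padj_cutoff and abs(log2_fc) > lfc_cutoff
    ]


# Invented values for illustration only.
results = {
    "locus_A": (8.2, 1e-6),    # large change, significant
    "locus_B": (0.9, 0.01),    # significant but change too small
    "locus_C": (2.4, 0.20),    # large change but not significant
    "locus_D": (-2.0, 0.001),  # downregulated, passes both cutoffs
}
hits = significant_proviruses(results)

# Converting a linear fold change to the log2 scale used by the cutoff.
log2_of_300_fold = math.log2(300)
```

The absolute value in the filter keeps downregulated entries as well; in the abstract all expressed loci were upregulated, so only the positive side would apply there.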


Subjects
Endogenous Retroviruses , Hepatoblastoma , Liver Neoplasms , Biomarkers , Child , Endogenous Retroviruses/genetics , Hepatoblastoma/genetics , Humans , Immunotherapy , Liver Neoplasms/genetics , RNA, Messenger/genetics , Up-Regulation
15.
Genes (Basel) ; 11(12)2020 12 01.
Article in English | MEDLINE | ID: mdl-33271747

ABSTRACT

Background: Machine learning (ML) has emerged as a powerful approach for predicting outcomes based on patterns and inferences. Improving prediction of severe coronary artery disease (CAD) has the potential for personalizing prevention and treatment strategies and for identifying individuals that may benefit from cardiac catheterization. We developed a novel ML approach combining traditional cardiac risk factors (CRF) with a single nucleotide polymorphism (SNP) in a gene associated with human CAD (ID3 rs11574) to enhance prediction of CAD severity. Methods: ML models incorporating CRF along with ID3 genotype at rs11574 were evaluated. The most predictive model, a deep neural network, was used to classify patients into high (>32) and low level (≤32) Gensini severity score. This model was trained on 325 and validated on 82 patients. Prediction performance of the model was summarized by a confusion matrix and area under the receiver operating characteristic curve (ROC-AUC). Results: Our neural network predicted severity score with 81% and 87% accuracy for the low and the high groups respectively with an ROC-AUC of 0.84 for 82 patients in the test group. The addition of ID3 rs11574 to CRF significantly enhanced prediction accuracy from 65% to 81% in the low group, and 72% to 84% in the high group. Age, high-density lipoprotein (HDL), and systolic blood pressure were the top 3 contributors in predicting severity score. Conclusions: Our neural network including ID3 rs11574 improved prediction of CAD severity over use of Framingham score, which may potentially be helpful for clinical decision making in patients at increased risk of complications from coronary angiography.
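The two summaries named in the methods, a confusion matrix and ROC-AUC, can be computed as follows. This is a sketch on invented labels and scores, not the study's data: label 1 stands for the high-severity group and 0 for the low-severity group, the 0.5 decision threshold is an assumption, and the AUC uses the standard pairwise definition (the fraction of positive/negative pairs in which the positive sample scores higher, with ties counted as half).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = high severity, 0 = low severity.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy_high_group = tp / (tp + fn)  # fraction of high-severity cases caught
accuracy_low_group = tn / (tn + fp)   # fraction of low-severity cases caught
auc = roc_auc(y_true, scores)
```

Reporting accuracy separately for the high and low groups, as the abstract does, corresponds to the two per-class fractions above rather than to a single overall accuracy.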


Subjects
Coronary Artery Disease/genetics , Polymorphism, Single Nucleotide/genetics , Adult , Aged , Aged, 80 and over , Cohort Studies , Female , Genotype , Humans , Machine Learning , Male , Middle Aged , Neural Networks, Computer , ROC Curve , Risk Assessment/methods , Risk Factors , Severity of Illness Index
16.
Genome Biol ; 21(1): 240, 2020 09 07.
Article in English | MEDLINE | ID: mdl-32894181

ABSTRACT

A key challenge in epigenetics is to determine the biological significance of epigenetic variation among individuals. We present Coordinate Covariation Analysis (COCOA), a computational framework that uses covariation of epigenetic signals across individuals and a database of region sets to annotate epigenetic heterogeneity. COCOA is the first such tool for DNA methylation data and can also analyze any epigenetic signal with genomic coordinates. We demonstrate COCOA's utility by analyzing DNA methylation, ATAC-seq, and multi-omic data in supervised and unsupervised analyses, showing that COCOA provides new understanding of inter-sample epigenetic variation. COCOA is available on Bioconductor (http://bioconductor.org/packages/COCOA).


Subjects
Epigenesis, Genetic , Epigenomics/methods , Genetic Heterogeneity , Software , Breast Neoplasms/genetics , DNA Methylation , Humans , Molecular Sequence Annotation
17.
Quantum Mach Intell ; 2(1): 1-26, 2020.
Article in English | MEDLINE | ID: mdl-32879908

ABSTRACT

Motivated by the problem of classifying individuals with a disease versus controls using a functional genomic attribute as input, we present relatively efficient general purpose inner product-based kernel classifiers to classify the test as a normal or disease sample. We encode each training sample as a string of 1s (presence) and 0s (absence) representing the attribute's existence across ordered physical blocks of the subdivided genome. Having binary-valued features allows for highly efficient data encoding in the computational basis for classifiers relying on binary operations. Given that a natural distance between binary strings is Hamming distance, which shares properties with bit-string inner products, our two classifiers apply different inner product measures for classification. The active inner product (AIP) is a direct dot product-based classifier whereas the symmetric inner product (SIP) classifies upon scoring correspondingly matching genomic attributes. SIP is a strongly Hamming distance-based classifier generally applicable to binary attribute-matching problems whereas AIP has general applications as a simple dot product-based classifier. The classifiers implement an inner product between N = 2^n-dimensional test and train vectors using n Fredkin gates while the training sets are respectively entangled with the class-label qubit, without use of an ancilla. Moreover, each training class can be composed of an arbitrary number m of samples that can be classically summed into one input string to effectively execute all test-train inner products simultaneously. Thus, our circuits require the same number of qubits for any number of training samples and are O(log N) in gate complexity after the states are prepared. Our classifiers were implemented on ibmqx2 (IBM-Q-team 2019b) and ibmq_16_melbourne (IBM-Q-team 2019a). The latter allowed encoding of 64 training features across the genome.
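A purely classical analogue may help make the two scoring ideas concrete. In the sketch below, the "active" score is the dot product of two bit strings (it counts positions where both are 1), and the "symmetric" score counts positions where the bits agree, whether 1-1 or 0-0, which equals the string length minus the Hamming distance. These definitions, the per-sample averaging, the function names, and the toy data are assumptions made for illustration; the abstract does not give the authors' exact formulas, and nothing here models the quantum circuit.

```python
def active_score(test, training):
    """Sum of dot products between the test string and each training string.

    Only positions where both bits are 1 contribute.
    """
    return sum(sum(t & s for t, s in zip(test, sample)) for sample in training)


def symmetric_score(test, training):
    """Count of agreeing positions (1-1 or 0-0), summed over training strings.

    For one pair of strings this equals len(test) minus the Hamming distance.
    """
    return sum(
        sum(1 for t, s in zip(test, sample) if t == s) for sample in training
    )


def classify(test, classes, score):
    """Pick the class whose training samples score highest on average."""
    return max(
        classes,
        key=lambda label: score(test, classes[label]) / len(classes[label]),
    )


# Toy data: each string marks presence (1) or absence (0) of an attribute
# in four ordered genome blocks.
classes = {
    "disease": [[1, 1, 0, 0], [1, 1, 1, 0]],
    "normal": [[0, 0, 1, 1], [0, 1, 1, 1]],
}
test_sample = [1, 1, 0, 1]

label_active = classify(test_sample, classes, active_score)
label_symmetric = classify(test_sample, classes, symmetric_score)
```

Because the dot product is linear, summing a class's training strings first and taking one dot product with the test string gives the same active score, which mirrors the abstract's remark that samples can be classically summed into one input string.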

18.
Data Brief ; 31: 105895, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32637500

ABSTRACT

Human Endogenous Retroviruses are a class of genomic elements that are the result of ancient retroviral infection of the human germline. Many are biologically active elements that have been implicated in multiple diseases including cancer. The most recent class to invade the human genome is the HERV-K(HML-2) (HERV-K) family. Approximately 90 HERV-K proviruses and many smaller elements have been identified to date in the human genome. Additional proviruses are continually being discovered with the rapid advancement of deep-sequencing and long-read sequencing technologies. HERV-K proviruses are poorly annotated in human transcriptome databases, making their analysis in RNA-seq data difficult. To enable analysis, we compiled the sequences of 91 HERV-K proviruses identified in NCBI GenBank (ID JN675007-JN675097) and created a proviral alignment tool for visualizing RNA-seq reads aligned across individual proviruses. This allowed us to analyse publicly available RNA-seq data from 10 hepatoblastoma samples and 3 normal liver controls (GEO Accession ID: GSE89775). This data report includes the raw FASTA sequence files of the HERV-K proviruses from NCBI, a differential gene expression list between hepatoblastoma samples, and genomic alignment figures from 5 HERV-K proviruses identified as differentially expressed in the companion research article "Upregulation of Human Endogenous Retrovirus-K (HML-2) mRNAs in hepatoblastoma: Identification of potential new immunotherapeutic targets and biomarkers" [1]. The data provided here are available for other research groups interested in evaluating individual HERV-K proviral expression using RNA-seq data. Furthermore, the data analysis is highly flexible and will accommodate the addition of other HERV-K proviruses.

19.
Circulation ; 142(21): 2045-2059, 2020 11 24.
Article in English | MEDLINE | ID: mdl-32674599

ABSTRACT

BACKGROUND: Rupture and erosion of advanced atherosclerotic lesions with a resultant myocardial infarction or stroke are the leading worldwide cause of death. However, we have a limited understanding of the identity, origin, and function of many cells that make up late-stage atherosclerotic lesions, as well as the mechanisms by which they control plaque stability. METHODS: We conducted a comprehensive single-cell RNA sequencing of advanced human carotid endarterectomy samples and compared these with single-cell RNA sequencing from murine microdissected advanced atherosclerotic lesions with smooth muscle cell (SMC) and endothelial lineage tracing to survey all plaque cell types and rigorously determine their origin. We further used chromatin immunoprecipitation sequencing (ChIP-seq), bulk RNA sequencing, and an innovative dual lineage tracing mouse to understand the mechanism by which SMC phenotypic transitions affect lesion pathogenesis. RESULTS: We provide evidence that SMC-specific Klf4- versus Oct4-knockout showed virtually opposite genomic signatures, and their putative target genes play an important role regulating SMC phenotypic changes. Single-cell RNA sequencing revealed remarkable similarity of transcriptomic clusters between mouse and human lesions and extensive plasticity of SMC- and endothelial cell-derived cells including 7 distinct clusters, most negative for traditional markers. In particular, SMC contributed to a Myh11-, Lgals3+ population with a chondrocyte-like gene signature that was markedly reduced with SMC-Klf4 knockout. We observed that SMCs that activate Lgals3 compose up to two thirds of all SMC in lesions. However, initial activation of Lgals3 in these cells does not represent conversion to a terminally differentiated state, but rather represents transition of these cells to a unique stem cell marker gene-positive, extracellular matrix-remodeling, "pioneer" cell phenotype that is the first to invest within lesions and subsequently gives rise to at least 3 other SMC phenotypes within advanced lesions, including Klf4-dependent osteogenic phenotypes likely to contribute to plaque calcification and plaque destabilization. CONCLUSIONS: Taken together, these results provide evidence that SMC-derived cells within advanced mouse and human atherosclerotic lesions exhibit far greater phenotypic plasticity than generally believed, with Klf4 regulating transition to multiple phenotypes including Lgals3+ osteogenic cells likely to be detrimental for late-stage atherosclerosis plaque pathogenesis.


Subjects
Atherosclerosis/genetics , Atherosclerosis/pathology , Kruppel-Like Transcription Factors/genetics , Myocytes, Smooth Muscle/pathology , Octamer Transcription Factor-3/genetics , Pluripotent Stem Cells/pathology , Animals , Female , Humans , Kruppel-Like Factor 4 , Male , Mice , Mice, Knockout , Phenotype , Sequence Analysis, RNA/methods

20.
J Biol Chem ; 295(12): 3990-4000, 2020 03 20.
Article in English | MEDLINE | ID: mdl-32029477

ABSTRACT

DNA double-stranded breaks (DSBs) are strongly associated with active transcription, and promoter-proximal pausing of RNA polymerase II (Pol II) is a critical step in transcriptional regulation. Mapping the distribution of DSBs along actively expressed genes and identifying the location of DSBs relative to pausing sites can provide mechanistic insights into transcriptional regulation. Using genome-wide DNA break mapping/sequencing techniques at single-nucleotide resolution in human cells, we found that DSBs are preferentially located around transcription start sites of highly transcribed and paused genes and that Pol II promoter-proximal pausing sites are enriched in DSBs. We observed that DSB frequency at pausing sites increases as the strength of pausing increases, regardless of whether the pausing sites are near or far from annotated transcription start sites. Inhibition of topoisomerase I and II by camptothecin and etoposide treatment, respectively, increased DSBs at the pausing sites as the concentrations of drugs increased, demonstrating the involvement of topoisomerases in DSB generation at the pausing sites. DNA breaks generated by topoisomerases are short-lived because of the religation activity of these enzymes, which these drugs inhibit; therefore, the observation of increased DSBs with increasing drug doses at pausing sites indicated active recruitment of topoisomerases to these sites. Furthermore, the enrichment and locations of DSBs at pausing sites were shared among different cell types, suggesting that Pol II promoter-proximal pausing is a common regulatory mechanism. Our findings support a model in which topoisomerases participate in Pol II promoter-proximal pausing and indicated that DSBs at pausing sites contribute to transcriptional activation.


Subjects
DNA Breaks, Double-Stranded , RNA Polymerase II/metabolism , Camptothecin/metabolism , Camptothecin/pharmacology , DNA Breaks, Double-Stranded/drug effects , DNA Topoisomerases, Type I/chemistry , DNA Topoisomerases, Type I/metabolism , DNA Topoisomerases, Type II/chemistry , DNA Topoisomerases, Type II/metabolism , Etoposide/metabolism , Etoposide/pharmacology , HeLa Cells , Humans , Transcription Initiation Site , Transcriptional Activation/drug effects
...